1. Summary

A paper by Schmidt and Lipson [3, 4] introduced the idea that "free-form natural laws"
can be learned from experimental measurements in a physical system using symbolic 
regression algorithms. The most important component of that work is a fitness function 
involving the pair-wise derivatives of time-series measurements of system state-space data, 
which is purported to contain no assumptions about physical laws. After a thorough 
examination of their paper [3] and supplemental materials [4], we submitted a technical
comment to Science, which was eventually rejected. We summarize the findings
of our investigation of [3, 4] here:

• The paper makes nonstandard use of mathematical terms and symbols such as
"dependent" and "independent" variables, "symbolic derivative", "differential
relationships", "law equation", "$\frac{\delta f}{\delta x}$", "$\frac{\delta x}{\delta y}$", and lacks clear definitions for concepts.

• No theoretical justification is provided for their methods. 

• The proposed fitness function [4, Equation S8] is flat for general systems.

• An alteration of their fitness function for higher order systems is able to find 
Hamiltonians and special classes of Lagrangians, but not general Lagrangians. 

• Previous related work is not cited. Symbolic equation finding for time-series data
appeared in [1], while [2] addressed fundamental issues with the approach.

• A direct incorporation of Hamilton's equations into a fitness function finds the 
(unique) Hamiltonian of a system. One also finds Lagrangians by incorporating 
the Euler-Lagrange equations into such a function. 

If the fitness function for systems with more than two variables [4, Equation S8] is
reinterpreted (as discussed in Section 3), then the paradigm can discover (non-canonical)
compositions of Hamiltonians with differentiable functions and special classes of Lagrangians.
This follows from the specific form of the fitness function: it encodes either a
consequence of Hamilton's equations of motion, Eqn. (7), or Newton's second law for a force
arising from a potential, Eqn. (8). In particular, the fitness directly incorporates laws of
physics. Thus, a major claim that "[w]ithout any prior knowledge about physics . . . the
algorithm discovered Hamiltonians, Lagrangians and other laws" [3] appears to be false.

The organization of this document is as follows. In Section 2 we prove mathematically
that the general fitness function of the authors is inadequate. In Section 3 we explain
how a fitness function different from the one described in [4, Equation S8] might have
been used by Schmidt and Lipson to obtain their results. We also explain how physical
laws are encoded in this measure. Finally, in Section 4 we offer another approach to the
fitness which gives the unique Hamiltonian of a system as well as general Lagrangians.
2. A FLAT FITNESS FUNCTION



The authors of [3] search for "conservation law equations" between measured variables
in a physical system by performing symbolic regression over special function classes. The
basic ideas and building blocks are represented in [3, Figure 2]. Given a possible
conservation function $f$, its fitness with respect to the data is calculated; functions with
high fitness are then mated and mutated according to standard (genetic) symbolic
regression routines. The authors' fitness measure for higher order systems [4, Equation S8]
(as extracted from a careful reading of [4]), however, is provably inadequate. We first go
through the mathematical details that support this assertion. We then show how Schmidt
and Lipson carried out this argument explicitly for the Hamiltonian $f$ of a double
pendulum system [4, Section S3]. The authors intended for this calculation to verify that,
in this case, their fitness measure correctly identified the conservation law. As we now
show, however, any function $f$ would have produced this same fitness.

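Given a function $f(q_1, \ldots, q_d; \dot q_1, \ldots, \dot q_d)$ in $d > 1$ generalized coordinates, a system
trajectory

\[ \Gamma = \{ (q_1(t), \ldots, q_d(t), \dot q_1(t), \ldots, \dot q_d(t)) : t \in [0, a] \}, \]

and a pair of system variables $x, y \in \{q_1, \ldots, q_d, \dot q_1, \ldots, \dot q_d\}$, Schmidt and Lipson define
the following three quantities (see [3, Figure 2], [4, Equation S4], [4, Section S2], and [4,
Section S3]):

\[ \frac{\delta f}{\delta y} := \frac{\partial f}{\partial y} + \frac{\partial f}{\partial x}\frac{dx}{dy}, \qquad
   \frac{\delta f}{\delta x} := \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}, \qquad
   \left.\frac{\delta x}{\delta y}\right|_{\text{pairing}} := \frac{\delta f}{\delta y} \Big/ \frac{\delta f}{\delta x}. \tag{1} \]

Here, the quantity $\frac{\partial f}{\partial x}$ is standard notation for the partial derivative of the function $f$
with respect to the variable $x$, and the quantity $\frac{dx}{dy}$ (resp. $\frac{dy}{dx}$) is the rate of change of $x$
with respect to $y$ (resp. of $y$ with respect to $x$) along the trajectory $\Gamma$:

\[ \frac{dx}{dy} := \frac{\dot x(t)}{\dot y(t)} = \frac{dx(t)}{dt} \Big/ \frac{dy(t)}{dt}. \]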

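The pair $\{x, y\}$ is called a "variable pairing", and the "measure of predictive ability"
of a potential conservation law $f$ is given by [4, p. 5]:

\[ \min_{\text{pairing}} \; -\frac{1}{N} \sum_{k=1}^{N} \log\left( 1 + \left| \frac{dx_k}{dy_k} - \left.\frac{\delta x_k}{\delta y_k}\right|_{\text{pairing}} \right| \right). \tag{2} \]

The symbol $\frac{dx_k}{dy_k}$ (similarly for $\frac{\delta x_k}{\delta y_k}$) denotes the evaluation of $\frac{dx}{dy}$ at discretized time step $k$ of $\Gamma$.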

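We claim that for any pairing $\{x, y\}$ and any function $f(q_1, \ldots, q_d; \dot q_1, \ldots, \dot q_d)$, the
quantity $\left.\frac{\delta x}{\delta y}\right|_{\text{pairing}}$ is identically equal to $\frac{dx}{dy}$ along all points of the trajectory. This implies
that the fitness function (2) evaluates to zero for every function $f$. The following is the
straightforward proof. For each pairing $\{x, y\}$, using $\frac{dx}{dy}\frac{dy}{dx} = 1$, we have:

\[ \left.\frac{\delta x}{\delta y}\right|_{\text{pairing}}
   = \frac{\frac{\partial f}{\partial y} + \frac{\partial f}{\partial x}\frac{dx}{dy}}{\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}}
   = \frac{\frac{dx}{dy}\left( \frac{\partial f}{\partial y}\frac{dy}{dx} + \frac{\partial f}{\partial x} \right)}{\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}}
   = \frac{dx}{dy}. \tag{3} \]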



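In Section S3 of [4], the authors provide "an example calculation of a partial derivative
pair for a double pendulum Hamiltonian" $f(\theta_1, \theta_2; \omega_1, \omega_2)$:

\[ \text{``} f = \omega_1^2 + \omega_2^2 + \omega_1 \omega_2 \cos(\theta_1 - \theta_2) - \cos\theta_1 - \cos\theta_2 . \text{''} \tag{4} \]

They compute $\frac{\delta f}{\delta \theta_1}$ and $\frac{\delta f}{\delta \theta_2}$ for the variable pairing $\{\theta_1, \theta_2\}$, writing:

\[ \begin{aligned}
\text{``} \frac{\delta f}{\delta \theta_1} &= -\omega_1 \omega_2 \sin(\theta_1 - \theta_2) \cdot \left( 1 - \frac{\Delta\theta_2}{\Delta\theta_1} \right) + \sin\theta_1 + \frac{\Delta\theta_2}{\Delta\theta_1} \sin\theta_2 \\
\frac{\delta f}{\delta \theta_2} &= -\omega_1 \omega_2 \sin(\theta_1 - \theta_2) \cdot \left( \frac{\Delta\theta_1}{\Delta\theta_2} - 1 \right) + \frac{\Delta\theta_1}{\Delta\theta_2} \sin\theta_1 + \sin\theta_2 \text{''} .
\end{aligned} \tag{5} \]

Equation (5) is precisely definition (1) with $x$ and $y$ being $\theta_1$ and $\theta_2$, respectively. The
authors then go on to calculate that the ratio $\frac{\delta f}{\delta \theta_2} \big/ \frac{\delta f}{\delta \theta_1}$ upon simplification is $\frac{\Delta\theta_1}{\Delta\theta_2}$. They
claim this shows that the "partial derivative ratio resolves numerically to our estimated
partial derivative pair from the experimental data, relating Eqns. (S1) and (S2)." As we
proved in (3), every function $f$ gives an equality between "Eqns. (S1) and (S2)" in
this way. Moreover, this is the case for each variable pairing, implying a flat fitness using
the fitness function in [3, 4] for any $f$.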
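
The identity (3) can also be checked numerically. The following is a minimal sketch under stated assumptions: the sample points and slopes are synthetic random numbers rather than measurements, the partial derivatives are estimated by central differences, and the names `partials` and `pairing_fitness` are hypothetical. The three candidate functions are arbitrary and unrelated to any physical law; they were picked with positive partial derivatives only so that the denominators in (1) stay away from zero.

```python
import math
import random

def partials(f, x, y, h=1e-6):
    """Central-difference estimates of (df/dx, df/dy) at the point (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

def pairing_fitness(f, samples):
    """Fitness (2) for a single pairing; samples holds (x, y, dx/dy) triples."""
    total = 0.0
    for x, y, dxdy in samples:
        fx, fy = partials(f, x, y)
        delta_f_dy = fy + fx * dxdy       # delta f / delta y from (1)
        delta_f_dx = fx + fy / dxdy       # delta f / delta x from (1)
        ratio = delta_f_dy / delta_f_dx   # delta x / delta y for the pairing
        total += math.log(1.0 + abs(dxdy - ratio))
    return -total / len(samples)

# Synthetic data (an assumption, not measurements): random points and slopes.
rng = random.Random(0)
samples = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(0.5, 2.0))
           for _ in range(200)]

# Arbitrary candidates with positive partials (keeps the ratio well defined).
candidates = {
    "exp(x + 2y)": lambda x, y: math.exp(x + 2 * y),
    "x^3 + x + 5y": lambda x, y: x ** 3 + x + 5 * y,
    "atan(x) + sinh(y)": lambda x, y: math.atan(x) + math.sinh(y),
}

fitness = {name: pairing_fitness(f, samples) for name, f in candidates.items()}
for name, value in fitness.items():
    print(f"{name:>18}: {value:.3e}")
```

Each candidate should score zero up to floating-point rounding, since the cancellation in (3) is purely algebraic and does not depend on the values of the partial derivatives.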

3. How Schmidt and Lipson conceivably arrived at their results 

We believe that Schmidt and Lipson are choosing a fitness measure $M(f)$ with the
following property: it is very large for (potential law) functions $f = f(q_1, \ldots, q_d; \dot q_1, \ldots, \dot q_d)$
(with $d > 1$) when

\[ \frac{\partial f}{\partial y} \Big/ \frac{\partial f}{\partial x} \;\pm\; \frac{dx}{dt} \Big/ \frac{dy}{dt} \quad \text{is close to zero} \tag{6} \]

for some pair $\{x, y\} \subset \{q_1, \ldots, q_d, \dot q_1, \ldots, \dot q_d\}$ of variables, and some choice of sign.¹
Given that their fitness function for two variables ($d = 1$) satisfies this property (with the
symbol "$\frac{\delta f}{\delta x}$" interpreted as a partial derivative $\frac{\partial f}{\partial x}$), it is likely that (6) was the one used
in their experiments. It remains to understand how making the expression in (6) small
could find laws in the system.

Consider first the possibility that $y = \dot x$ and $x$ is a coordinate. Then, expression (6)
using the plus "+" sign is zero when

\[ \frac{\partial f}{\partial y}\,\dot y + \frac{\partial f}{\partial x}\,\dot x = 0. \tag{7} \]

Observe that if $f = \mathcal{H}$ is the Hamiltonian of the system, then $f$ satisfies Hamilton's
equations: $\frac{\partial \mathcal{H}}{\partial x} = -\dot y$, $\frac{\partial \mathcal{H}}{\partial y} = \dot x$. In particular, $\mathcal{H}$ solves (7). Notice also that for any
differentiable function $g : \mathbb{R} \to \mathbb{R}$, we have that $f = g(\mathcal{H})$ also satisfies (7). Thus, a high
fitness should correspond to laws of the form $g(\mathcal{H})$; moreover, since $\frac{d g(\mathcal{H})}{dt} = g'(\mathcal{H}) \frac{d\mathcal{H}}{dt} = 0$,
these functions would be constants of motion. Since the class of solutions to (7) is so large
(it contains, for instance, all power series in $\mathcal{H}$), it is unclear why it would home in on a
particular such function $f$, except that possibly the small-height operation tree setup of
[3] constrains the complexity of $f$ so much that it is biased to find scalar multiples of $\mathcal{H}$.

¹ Quoting the end of [4, Section S1]: "The partial derivative pairs define a cloud of line segments in phase
space, therefore we are only interested in matching the line but not necessarily the direction of the line.
Negating the $\Delta x / \Delta y$ term or taking the absolute value of both can affect the signs of terms in the optimal
law equation (for example, sign differences between Lagrangian and Hamiltonian equations)."
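
The observation that every $g(\mathcal{H})$ solves (7) can be illustrated numerically. The sketch below is only an illustration under assumptions not taken from the source: a unit-mass harmonic oscillator with $\omega^2 = 6$, exact (not measured) trajectory samples, central-difference partial derivatives, and the hypothetical helper names `trajectory` and `max_residual`.

```python
import math

OMEGA_SQ = 6.0                  # assumed value of omega^2, for illustration only
OMEGA = math.sqrt(OMEGA_SQ)

def trajectory(n=400, amplitude=1.0, phase=0.3, t_max=10.0):
    """Exact samples (x, y, xdot, ydot) of xddot = -omega^2 x, with y = xdot."""
    out = []
    for k in range(n):
        t = t_max * k / (n - 1)
        x = amplitude * math.cos(OMEGA * t + phase)
        y = -amplitude * OMEGA * math.sin(OMEGA * t + phase)
        out.append((x, y, y, -OMEGA_SQ * x))
    return out

def max_residual(f, samples, h=1e-6):
    """Largest |f_y * ydot + f_x * xdot| over the samples: left side of (7)."""
    worst = 0.0
    for x, y, xdot, ydot in samples:
        fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
        fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
        worst = max(worst, abs(fy * ydot + fx * xdot))
    return worst

def hamiltonian(x, y):
    return 0.5 * y * y + 0.5 * OMEGA_SQ * x * x

samples = trajectory()
residuals = {
    "H": max_residual(hamiltonian, samples),
    "H^2": max_residual(lambda x, y: hamiltonian(x, y) ** 2, samples),
    "sin(H)": max_residual(lambda x, y: math.sin(hamiltonian(x, y)), samples),
    "x*y (not a law)": max_residual(lambda x, y: x * y, samples),
}
for name, value in residuals.items():
    print(f"{name:>16}: {value:.3e}")
```

Under these assumptions the three functions of $\mathcal{H}$ leave a residual at the level of the finite-difference error, while the non-conserved candidate $xy$ does not, which is the sense in which (7) alone cannot single out $\mathcal{H}$.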

If instead a minus "−" sign had been chosen in (6), then one has an equation of the
form $\frac{\partial f}{\partial y}\,\dot y = y\,\frac{\partial f}{\partial x}$. Suppose that $f = \mathcal{L} = T - V$ is the Lagrangian of the system with $T$
the kinetic energy and $V$ the potential energy. If it turns out that the kinetic energy $T$
has the special form $\sum_{i=1}^{d} \frac{1}{2} m_i \dot x_i^2$, then $\frac{\partial \mathcal{L}}{\partial y} = m y$ and $\frac{\partial \mathcal{L}}{\partial x} = -\frac{\partial V}{\partial x}$, and this fitness equation is equivalent to

\[ m \dot y = -\frac{\partial V}{\partial x}, \tag{8} \]

which is Newton's second law for a force arising from a potential. However, if $T$ does not
have the special form indicated, then (6) will not find a Lagrangian. As a simple example,
the Lagrangian of a double pendulum system cannot be found in this way. This is perhaps
why the authors of [3] were unable to find the Lagrangian for a double pendulum system
in their experiments (but were able to find its Hamiltonian).

In conclusion, once a sign is chosen in the modified fitness function (6), it is possible to
find (non-canonical) functions of Hamiltonians. When the opposite sign is chosen, it is
possible in special circumstances to arrive at Lagrangians, but not possible in general.
Nonetheless, in each case, the natural law determined by the vanishing of this fitness
measure is a consequence of classical physics embedded in the measure. Given these
considerations, it is unclear why one would not choose metrics specifically tailored to
finding Hamiltonians and Lagrangians independently (such as those in Section 4 below).



4. Fitness measures that find Hamiltonians and Lagrangians of a system 

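Based on theoretical considerations, we propose fitness criteria for finding Lagrangians
and Hamiltonians of a physical system. In classical physics, the Lagrangian $\mathcal{L}$ of a system
with (generalized) coordinates $x_i, \dot x_i$ ($i = 1, \ldots, d$) solves the Euler-Lagrange equations:

\[ \frac{d}{dt}\left( \frac{\partial \mathcal{L}}{\partial \dot x_i} \right) - \frac{\partial \mathcal{L}}{\partial x_i} = 0. \]

If we now assume that $\mathcal{L}$ is a function of the coordinates $x_j$ and their time derivatives $\dot x_j$,
then by the chain rule, we have for each $i$ and all $t \in [0, a]$:

\[ EL_i(\mathcal{L}, t) := \sum_{j=1}^{d} \frac{\partial^2 \mathcal{L}}{\partial x_j \partial \dot x_i} \frac{dx_j}{dt}
   + \sum_{j=1}^{d} \frac{\partial^2 \mathcal{L}}{\partial \dot x_j \partial \dot x_i} \frac{d\dot x_j}{dt}
   - \frac{\partial \mathcal{L}}{\partial x_i} = 0. \tag{9} \]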

If we are given a function $f$ and discretized coordinate trajectory data coming from
a physical system, then we may use $EL_i(f, t)$ as a measure of the Lagrangian fitness
LFit of a potential conservation function $f$. That is, we compute symbolically the partial
derivatives in (9) and evaluate numerically the time derivatives $\frac{dx_j}{dt}$ and $\frac{d\dot x_j}{dt}$ over discretized
time steps $t_k$ to calculate:

\[ \mathrm{LFit}(f) := -\frac{1}{N} \sum_{k=1}^{N} \log\left( 1 + \sum_{i=1}^{d} \left| EL_i(f, t_k) \right| \right). \tag{10} \]
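
As an illustration of (9) and (10), the following minimal sketch evaluates LFit for a single coordinate ($d = 1$). It rests on assumptions that are not part of the source: synthetic positions of a harmonic oscillator with $\omega^2 = 6$, central differences for the time derivatives, and partial derivatives of each candidate worked out by hand and passed in through a hypothetical `partials` callback.

```python
import math

OMEGA_SQ = 6.0                  # assumed omega^2 for the synthetic data
OMEGA = math.sqrt(OMEGA_SQ)
DT = 0.001                      # assumed sampling interval

def sample_positions(n=5000, amplitude=1.0, phase=0.2):
    """Synthetic positions x(t_k) = A cos(omega t_k + phi); not measured data."""
    return [amplitude * math.cos(OMEGA * k * DT + phase) for k in range(n)]

def lfit(partials, xs, dt):
    """LFit from (10) for d = 1.

    partials(x, v) must return (L_x, L_xv, L_vv): the first partial in x and
    the second partials d2L/(dx dv) and d2L/dv2 of the candidate Lagrangian.
    """
    total, count = 0.0, 0
    for k in range(1, len(xs) - 1):
        x = xs[k]
        v = (xs[k + 1] - xs[k - 1]) / (2 * dt)           # dx/dt
        a = (xs[k + 1] - 2 * x + xs[k - 1]) / dt ** 2    # d(xdot)/dt
        l_x, l_xv, l_vv = partials(x, v)
        el = l_xv * v + l_vv * a - l_x                   # EL(f, t_k) from (9)
        total += math.log(1.0 + abs(el))
        count += 1
    return -total / count

xs = sample_positions()

# L = v^2/2 - omega^2 x^2/2, the Lagrangian of the assumed oscillator.
lfit_true = lfit(lambda x, v: (-OMEGA_SQ * x, 0.0, 1.0), xs, DT)
# L = v^2/2 - x^2, a candidate with the wrong potential.
lfit_wrong = lfit(lambda x, v: (-2.0 * x, 0.0, 1.0), xs, DT)

print(f"LFit(true Lagrangian)  = {lfit_true:.3e}")
print(f"LFit(wrong potential)  = {lfit_wrong:.3e}")
```

In this setting the correct Lagrangian scores near zero (limited by the finite-difference error), and the candidate with the wrong potential scores clearly lower.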



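Consider now the equations for a Hamiltonian $\mathcal{H}$ of a physical system (which hold at
all times $t$ of the trajectory and for each coordinate $i$):

\[ HQ_i(\mathcal{H}, t) := \frac{\partial \mathcal{H}}{\partial \dot x_i} - \frac{dx_i}{dt} = 0, \qquad
   HP_i(\mathcal{H}, t) := \frac{\partial \mathcal{H}}{\partial x_i} + \frac{d\dot x_i}{dt} = 0. \]

Similar to above, we may measure the Hamiltonian fitness HFit of a function $f$ over the
discretized trajectory as follows:

\[ \mathrm{HFit}(f) := -\frac{1}{N} \sum_{k=1}^{N} \log\left( 1 + \sum_{i=1}^{d} \left| HQ_i(f, t_k) \right| + \sum_{i=1}^{d} \left| HP_i(f, t_k) \right| \right). \tag{11} \]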

It is straightforward to check that if HFit($f$) = 0 for all times $t$, then $f$ is a conserved
quantity (as $f$ is then the canonical scale-dependent Hamiltonian of the system). Clearly,
however, the methods proposed here are limited to analyzing measurements given in
canonical or generalized coordinates.
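
The scale dependence of (11) can be seen in a small numerical sketch. The setup below is an assumption made for illustration only: a unit-mass, unit-length pendulum integrated with a classical Runge-Kutta step (so that momentum equals velocity), central differences in time, and central-difference partial derivatives in place of the symbolic ones. The names `rk4_pendulum` and `hfit` are hypothetical.

```python
import math

DT = 0.01   # assumed integration and sampling step

def rk4_pendulum(q0, p0, dt, n):
    """Integrate q' = p, p' = -sin(q) with classical RK4 (assumed test system)."""
    def deriv(q, p):
        return p, -math.sin(q)
    q, p = q0, p0
    traj = [(q, p)]
    for _ in range(n):
        k1q, k1p = deriv(q, p)
        k2q, k2p = deriv(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = deriv(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = deriv(q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        traj.append((q, p))
    return traj

def hfit(f, traj, dt, h=1e-5):
    """HFit from (11) for d = 1, with finite-difference derivatives throughout."""
    total, count = 0.0, 0
    for k in range(1, len(traj) - 1):
        q, p = traj[k]
        dq_dt = (traj[k + 1][0] - traj[k - 1][0]) / (2 * dt)
        dp_dt = (traj[k + 1][1] - traj[k - 1][1]) / (2 * dt)
        f_q = (f(q + h, p) - f(q - h, p)) / (2 * h)
        f_p = (f(q, p + h) - f(q, p - h)) / (2 * h)
        hq = f_p - dq_dt          # HQ(f, t_k)
        hp = f_q + dp_dt          # HP(f, t_k)
        total += math.log(1.0 + abs(hq) + abs(hp))
        count += 1
    return -total / count

def hamiltonian(q, p):
    return 0.5 * p * p - math.cos(q)

traj = rk4_pendulum(1.0, 0.0, DT, 2000)
hfit_h = hfit(hamiltonian, traj, DT)
hfit_2h = hfit(lambda q, p: 2.0 * hamiltonian(q, p), traj, DT)

print(f"HFit(H)  = {hfit_h:.3e}")
print(f"HFit(2H) = {hfit_2h:.3e}")
```

Although $2\mathcal{H}$ is also conserved along the trajectory, it violates Hamilton's equations and is penalized, whereas $\mathcal{H}$ itself scores near zero under these assumptions.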




Figure 1. Fitness functions for a harmonic oscillator. The points in blue
come from using (2) while those in green are from (11).



For an experiment, we considered the simple harmonic oscillator system $\ddot x = -\omega^2 x$. Its
solutions are $x(t) = A\cos(\omega t + \phi)$, and its Hamiltonian is

\[ \mathcal{H} = \frac{1}{2}\dot x^2 + \frac{1}{2}\omega^2 x^2. \]

Supposing candidate laws of the form $f = \alpha \dot x^2 + \beta x \dot x + \gamma x^2$, we computed the fitness
as a function of $\alpha$, $\beta$, and $\gamma$ using our metric (11) and the one from (2). We used the
motion-tracked data supplied online in the supplementary materials of [4].



The optimal fitness using (11) occurred when $(\alpha, \beta, \gamma) = (.5, 0, 3)$. In Figure 1 we
plotted a linear section $\{\alpha = r - s,\ \beta = r - 2s,\ \gamma = 5r/2 + s\}$ of the different fitness
measures as a function of two varying parameters $r$ and $s$ (the peak for the Hamiltonian
fitness occurs at $r = 1$, $s = .5$). As predicted by (3), the fitness given by (2) was flat.
When we used the altered fitness (6), however, we did achieve optimal fitness at the same
parameters (not shown).
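
The scan over the section in $r$ and $s$ can be sketched as follows. This is not a reproduction of the experiment: it uses exact synthetic samples instead of the motion-tracked data, and it assumes $\omega^2 = 6$, a value chosen only because it makes the Hamiltonian coefficients equal to the reported optimum $(.5, 0, 3)$. The function names are hypothetical.

```python
import math

OMEGA_SQ = 6.0                  # assumption: makes H = 0.5*v^2 + 3*x^2
OMEGA = math.sqrt(OMEGA_SQ)

def samples(n=300, amplitude=1.0, phase=0.4, t_max=8.0):
    """Exact synthetic samples (x, xdot, xddot) of xddot = -omega^2 x."""
    out = []
    for k in range(n):
        t = t_max * k / (n - 1)
        x = amplitude * math.cos(OMEGA * t + phase)
        v = -amplitude * OMEGA * math.sin(OMEGA * t + phase)
        out.append((x, v, -OMEGA_SQ * x))
    return out

def hfit_quadratic(alpha, beta, gamma, data):
    """HFit (11) for f = alpha*v^2 + beta*x*v + gamma*x^2, partials by hand."""
    total = 0.0
    for x, v, a in data:
        hq = (2 * alpha * v + beta * x) - v    # df/dv - dx/dt
        hp = (beta * v + 2 * gamma * x) + a    # df/dx + dv/dt
        total += math.log(1.0 + abs(hq) + abs(hp))
    return -total / len(data)

data = samples()
best = None
for i in range(21):             # r = 0.0, 0.1, ..., 2.0
    for j in range(11):         # s = 0.0, 0.1, ..., 1.0
        r, s = i / 10, j / 10
        value = hfit_quadratic(r - s, r - 2 * s, 2.5 * r + s, data)
        if best is None or value > best[0]:
            best = (value, r, s)

best_value, best_r, best_s = best
print(f"best HFit = {best_value:.3e} at r = {best_r}, s = {best_s}")
```

On this grid the maximum of HFit falls at $r = 1$, $s = .5$, where both Hamilton residuals vanish, consistent with the peak location stated above.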